1 Preparatory steps for the Markdown file

Setting the options for knitr

library(knitr)
knitr::opts_chunk$set(echo = TRUE,
                       comment = NA,
                       prompt  = FALSE,
                       warning = FALSE,
                       message = FALSE,
                       fig.align = "center",
                       fig.width = 8.125,
                       out.width = "100%",
                       fig.path = "D:/figures/eL2/Part2-lowres/",
                       dev = c('png', 'tiff'),
                       dev.args = list(png = list(type = "cairo"), tiff = list(compression = 'lzw')),
                       dpi = 150,
                       cache = TRUE,
                       cache.path = "D:/cache/eL2/Part2-lowres/",
                       autodep = TRUE)
options(width = 1000, scipen = 999, knitr.kable.NA = '')

Setting the options for summarytools

library(summarytools)
st_options(plain.ascii = FALSE,        # Always use this option in Rmd documents
            style        = "rmarkdown", # Always use this option in Rmd documents
            footnote     = NA,          # Makes html-rendered results more concise
            subtitle.emphasis = FALSE)  # Improves layout with some rmarkdown themes
st_css()

Loading libraries…

library(tidyverse)
library(ggplot2) # Figures
library(gridExtra)
library(kableExtra)
library(lme4) # For lmer and glmer
library(lmerTest) # with p-values in the summary() and anova() functions
library(emmeans) # multiple comparisons and plotting effects
library(FSA) # For Dunn's test
library(stats) # For the Fligner-Killeen test (attached by default, so this call is optional)
library(DHARMa) # simulate residuals
library(sjPlot) # Plotting effects
library(glmmTMB)
library(stargazer)

Cleaning the memory:

rm(list = ls(all.names = TRUE))

Specifying a seed for random number generation, for the sake of reproducibility:

set.seed(123)

2 Mixed-effects logistic regression (bilingual subjects)

2.1 Loading the model and the processed dataset

We load what we saved previously:

load(file = "model.final.RData")

As a reminder, our model is a logistic regression with a number of fixed effects, including several interactions, and a random-effects structure consisting of i) random intercepts for nonword and L1 and ii) a random categorical slope for subject, with three sets of values for the three possible values of occurrence_l (whether there is no l in the nonword to repeat, an l in internal coda position or an l in final position).

2.2 Assessing the validity of our ‘best’ model

In order to trust the outputs of our model, we need to make sure that the assumptions underlying a mixed-effects logistic regression are met. While these assumptions are not as clear-cut as for linear regression, the following conditions matter:

  • The normality of the different distributions for the random effects
  • The correct distribution of the residuals of the model

(Note that we have already checked that our model is not singular)

2.2.1 Normality of the levels of the random effects

p <- plot_model(model.final, type = "diag", show.values = TRUE, value.offset = .3)
p$nonword + labs(title = "Random intercepts for nonword")

p$L1 + labs(title = "Random intercepts for L1")

p$subject + labs(title = "Random slopes for subject", subtitle = "occurrence_lcoda: l in internal coda position -- occurrence_lfinal: l in final position -- occurrence_lother: other nonwords") + theme_sjplot()

The quantile-quantile plots of the various random effects indicate that their levels follow a distribution which is quite close to a Gaussian distribution.
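The same check can be sketched in base R; the values below are simulated stand-ins for the fitted random effects, which in practice would come from lme4::ranef(model.final):

```r
# A minimal base-R alternative to the sjPlot diagnostics above: a normal
# QQ plot for one set of random intercepts. Here 're_vals' is simulated
# as a stand-in for, e.g., ranef(model.final)$nonword[, 1].
set.seed(123)
re_vals <- rnorm(71, mean = 0, sd = 0.58)  # 71 nonwords; sd is illustrative
qqnorm(re_vals, main = "Random intercepts for nonword")
qqline(re_vals)
```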

2.2.2 Residuals of the model

There is no expected normality of the model residuals with a logistic regression, but we can simulate residuals with the function simulateResiduals() of the DHARMa package. According to the authors of the package: ‘Even experienced statistical analysts currently have few options to diagnose misspecification problems in GLM. DHARMa package aims at solving these problems by creating readily interpretable residuals for generalized linear models that are standardized to values between 0 and 1, and that can be interpreted as intuitively as residuals for the linear model. This is achieved by a simulation-based approach that transforms the residuals to a standardized scale. The key idea is that, if the model is correctly specified, then the observed data should look like as if it was created from the fitted model. Hence, for a correctly specified model, all values of the cumulative distribution should appear with equal probability. That means we expect the distribution of the residuals to be flat, regardless of the binomial model structure.’ — Source: DHARMa: residual diagnostics for hierarchical (multi-level/mixed) regression models

sim.residuals <- simulateResiduals(fittedModel = model.final, n = 1000, refit = FALSE)
plot(sim.residuals) # plotSimulatedResiduals() is deprecated in recent versions of DHARMa

According to the graphics, the simulated residuals do not reveal any misspecification of the model: they are uniformly distributed (the QQ plot on the left closely follows the diagonal) and homogeneously distributed with respect to the model predictions (right graphic).

We can also inspect the residuals not with respect to the model predictions, but to the different main fixed effects.

plotResiduals(sim.residuals, form = df$occurrence_l, xlab = "Occurrence of l", ylab = "Simulated Standardized Residuals")

plotResiduals(sim.residuals, form = df$branching_onset, xlab = "branching_onset", ylab = "Simulated Standardized Residuals")

plotResiduals(sim.residuals, form = df$V, xlab = "V", ylab = "Simulated Standardized Residuals", asFactor = FALSE)

We do not see any violation of the uniformity and homogeneity of the distribution of the simulated residuals for our different categorical predictors.

plotResiduals(sim.residuals, form = df$rec_voc, xlab = "rec_voc", ylab = "Simulated Standardized Residuals")

plotResiduals(sim.residuals, form = df$phono_mem, xlab = "phono_mem", ylab = "Simulated Standardized Residuals", asFactor = FALSE)

plotResiduals(sim.residuals, form = df$LoE, xlab = "LoE", ylab = "Simulated Standardized Residuals")

plotResiduals(sim.residuals, form = df$age, xlab = "age", ylab = "Simulated Standardized Residuals")

We also do not see any violation of the uniformity and homogeneity of the distribution of the simulated residuals for our different numerical predictors.

2.2.3 Conclusions regarding the assumptions of the model

The key assumptions of our mixed-effects logistic regression are satisfied. We can therefore reasonably trust the outputs of the model, which we analyze in the next section.

2.3 Analyzing our ‘best’ model

We initially considered a number of fixed effects to be assessed with our statistical model.

There are a number of main effects:

  • occurrence_l (whether l appears in the nonword) (‘coda’ if l appears in internal coda position, ‘final’ if l appears in final position, ‘other’ otherwise)
  • branching_onset (number of branching onsets in the nonword) (0, 1 or 2)
  • V (number of vowels of the nonword) (1, 2 or 3)
  • rec_voc (size of a child’s French receptive vocabulary)
  • phono_mem (size of a child’s phonological memory)
  • LoE (length of a child’s exposure to French, in days)
  • age (child’s age, in months)

There are also a number of interactions:

  • occurrence_l * LoE
  • branching_onset * LoE
  • rec_voc * LoE
  • age * phono_mem
  • age * rec_voc
  • phono_mem * rec_voc

We formulate precise directional hypotheses to be tested for the main effects (all other things being equal):

  • Regarding occurrence_l: a nonword gets easier to repeat when shifting from l in coda position to l in final position to another structure
  • Regarding branching_onset: the more branching onsets a nonword has, the more difficult it is to repeat
  • Regarding V: following the literature, a nonword becomes more difficult to repeat when it has 3 vowels (no difference expected between 1 and 2 vowels)
  • Regarding rec_voc: the larger a child’s French receptive vocabulary, the easier it is for them to repeat the nonwords
  • Regarding phono_mem: the larger a child’s phonological memory, the easier it is for them to repeat the nonwords
  • Regarding LoE: the longer the exposure to French, the easier it is for them to repeat the nonwords
  • Regarding age: the older the children, the easier it is for them to repeat the nonwords

2.3.1 Analyzing the fixed effects

In the next paragraphs, one should note that the graphical representations report probabilities of correct repetition of a nonword, i.e., in the numerical space of the predicted variable. However, the statistical tests of the various effects consider the untransformed estimates and confidence intervals, i.e., in the numerical space of the linear combination of predictors/independent variables.
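To make the link between the two scales concrete, a log-odds value can be mapped to a probability with plogis() (the inverse logit). The numbers below are purely illustrative, taken from the fixed-effects summary reported further down (intercept and occurrence_lother coefficient) and ignoring the averaging over the other predictors:

```r
# Inverse logit: probability = plogis(log-odds).
# Illustrative values: intercept ~ 1.209 (reference level, l in coda),
# occurrence_lother coefficient ~ 1.118.
p_coda  <- plogis(1.209)          # predicted probability, l in internal coda
p_other <- plogis(1.209 + 1.118)  # predicted probability, no l
round(c(coda = p_coda, other = p_other), 3)  # → coda 0.770, other 0.911
```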

We rely on the emmeans package and the computation of estimated marginal means of either levels of factors or linear trends. To adjust the p-values in the case of multiple tests, we rely on a multivariate t distribution (adjust = “mvt”) with the same covariance structure as the estimates, as it is the closest to an exact adjustment (see https://cran.r-project.org/web/packages/emmeans/vignettes/confidence-intervals.html).
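For intuition about multiplicity adjustment, base R’s p.adjust() implements simpler, more conservative corrections; the mvt method used by emmeans exploits the correlation structure of the estimates and is typically less conservative. A toy comparison with hypothetical raw p-values:

```r
# Three hypothetical raw p-values for three pairwise contrasts.
p_raw <- c(0.020, 0.150, 0.700)
p.adjust(p_raw, method = "bonferroni")  # multiplies each by 3, capped at 1: 0.06 0.45 1.00
p.adjust(p_raw, method = "holm")        # step-down version, uniformly no larger: 0.06 0.30 0.70
```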

2.3.1.1 Investigating the interactions

We can first investigate the higher-order terms of the model, that is, the different interactions.

  • occurrence_l * LoE
  • branching_onset * LoE
  • rec_voc * LoE
  • age * phono_mem
  • age * rec_voc
  • phono_mem * rec_voc

The first two interactions are between one categorical variable and the continuous variable LoE (occurrence_l : LoE and branching_onset : LoE).

We first define a range of variation for the values of LoE:

list.LoE <- list(LoE = seq(min(df$LoE), max(df$LoE), by = 1))

For occurrence_l : LoE:

plot_model(model.final, type = "emm", terms = c("LoE [all]", "occurrence_l"))

emtrends(model.final, pairwise ~ occurrence_l | LoE, var= "LoE", adjust = "mvt", infer = c(T,T))
$emtrends
LoE = 180:
 occurrence_l LoE.trend      SE  df asymp.LCL asymp.UCL z.ratio p.value
 coda          0.000987 0.00347 Inf  -0.00582   0.00779   0.284  0.7761
 final         0.004485 0.00342 Inf  -0.00222   0.01119   1.311  0.1899
 other         0.000346 0.00144 Inf  -0.00248   0.00317   0.240  0.8103

Results are averaged over the levels of: branching_onset, V 
Confidence level used: 0.95 

$contrasts
LoE = 180:
 contrast       estimate      SE  df asymp.LCL asymp.UCL z.ratio p.value
 coda - final  -0.003498 0.00449 Inf  -0.01389   0.00690  -0.780  0.7039
 coda - other   0.000641 0.00290 Inf  -0.00607   0.00735   0.221  0.9719
 final - other  0.004139 0.00346 Inf  -0.00389   0.01217   1.195  0.4414

Results are averaged over the levels of: branching_onset, V 
Confidence level used: 0.95 
Conf-level adjustment: mvt method for 3 estimates 
P value adjustment: mvt method for 3 tests 

The figure shows that the slopes for the levels other and coda are quite close, and visually quite different from the slope for the third level final. The statistical tests, however, fail to detect a significant difference. The reason is probably the large standard errors for coda and final, which are due in turn to the small number of nonwords relating to these two levels (4 and 7 respectively, compared to 60 nonwords without l in coda or final position):

my_table <- df %>%
  select(nonword, occurrence_l) %>%
  unique() %>%
  with(., table(occurrence_l)) %>%
  as.data.frame()
colnames(my_table) <- c("Occurrence of l", "# nonwords")
my_table
  Occurrence of l # nonwords
1            coda          4
2           final          7
3           other         60

For branching_onset : LoE:

plot_model(model.final, type = "emm", terms = c("LoE [all]", "branching_onset"))

emtrends(model.final, pairwise ~ branching_onset | LoE, var= "LoE", adjust = "mvt", infer = c(T,T))
$emtrends
LoE = 180:
 branching_onset LoE.trend      SE  df asymp.LCL asymp.UCL z.ratio p.value
 0                0.002483 0.00189 Inf  -0.00122   0.00618   1.316  0.1883
 1                0.000658 0.00201 Inf  -0.00329   0.00460   0.327  0.7439
 2                0.002678 0.00308 Inf  -0.00335   0.00871   0.870  0.3842

Results are averaged over the levels of: occurrence_l, V 
Confidence level used: 0.95 

$contrasts
LoE = 180:
 contrast                             estimate      SE  df asymp.LCL asymp.UCL z.ratio p.value
 branching_onset0 - branching_onset1  0.001825 0.00111 Inf -0.000745   0.00440   1.637  0.2162
 branching_onset0 - branching_onset2 -0.000195 0.00256 Inf -0.006101   0.00571  -0.076  0.9966
 branching_onset1 - branching_onset2 -0.002020 0.00254 Inf -0.007881   0.00384  -0.795  0.6938

Results are averaged over the levels of: occurrence_l, V 
Confidence level used: 0.95 
Conf-level adjustment: mvt method for 3 estimates 
P value adjustment: mvt method for 3 tests 

While the figures show that the slopes for LoE differ according to the value of branching_onset, the statistical tests reveal that these slopes are actually not significantly different from each other.

We then have 4 interactions involving two continuous variables: rec_voc : LoE, age : phono_mem, age : rec_voc and phono_mem : rec_voc.

For rec_voc : LoE:

plot_model(model.final, type = "emm", terms = c("LoE [all]", "rec_voc"))

We can see that the different curves are nearly parallel, which suggests the lack of a significant interaction between the two variables.

Knowing that the interaction between rec_voc and LoE corresponds to the change in the slope of LoE for every one-unit increase in rec_voc (or vice versa), we can assess the significance of this interaction by looking at the contrast/difference between two slopes of rec_voc evaluated at LoE values one unit apart (the result is independent of the particular pair of values chosen).

list.LoE.red <- list(LoE = seq(median(df$LoE) - 0.5, median(df$LoE) + 0.5, by = 1))
emtrends(model.final, pairwise ~ LoE, var = "rec_voc", at = list.LoE.red, adjust = "mvt", infer = c(T, T))
$emtrends
 LoE rec_voc.trend     SE  df asymp.LCL asymp.UCL z.ratio p.value
 180        0.0721 0.0165 Inf    0.0398     0.104   4.373  <.0001
 180        0.0722 0.0165 Inf    0.0398     0.105   4.368  <.0001

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 

$contrasts
 contrast             estimate       SE  df asymp.LCL asymp.UCL z.ratio p.value
 LoE179.5 - LoE180.5 -0.000127 0.000193 Inf -0.000506  0.000253  -0.655  0.5126

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 

We find that the estimate for the interaction is not significantly different from 0 - the p-value is much larger than 0.05.

For age : phono_mem:

plot_model(model.final, type = "emm", terms = c("age [all]", "phono_mem"))

list.phono_mem.red <- list(phono_mem = seq(median(df$phono_mem) - 0.5, median(df$phono_mem) + 0.5, by = 1))
emtrends(model.final, pairwise ~ phono_mem, var = "age", at = list.phono_mem.red, adjust = "mvt", infer = c(T, T))
$emtrends
 phono_mem age.trend     SE  df asymp.LCL asymp.UCL z.ratio p.value
       3.5    0.0131 0.0103 Inf  -0.00715    0.0333   1.267  0.2052
       4.5    0.0201 0.0127 Inf  -0.00493    0.0450   1.573  0.1157

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 

$contrasts
 contrast                    estimate    SE  df asymp.LCL asymp.UCL z.ratio p.value
 phono_mem3.5 - phono_mem4.5 -0.00699 0.012 Inf   -0.0306    0.0166  -0.580  0.5617

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 

Once again, the p-value is much larger than 0.05.

For age : rec_voc:

plot_model(model.final, type = "emm", terms = c("age [all]", "rec_voc"))

list.rec_voc.red <- list(rec_voc = seq(median(df$rec_voc) - 0.5, median(df$rec_voc) + 0.5, by = 1))
emtrends(model.final, pairwise ~ rec_voc, var = "age", at = list.rec_voc.red, adjust = "mvt", infer = c(T, T))
$emtrends
 rec_voc age.trend      SE  df asymp.LCL asymp.UCL z.ratio p.value
    19.5    0.0153 0.00968 Inf  -0.00365    0.0343   1.583  0.1135
    20.5    0.0159 0.00954 Inf  -0.00275    0.0346   1.671  0.0947

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 

$contrasts
 contrast                   estimate      SE  df asymp.LCL asymp.UCL z.ratio p.value
 rec_voc19.5 - rec_voc20.5 -0.000624 0.00148 Inf  -0.00352   0.00227  -0.423  0.6726

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 

The p-value for the interaction is close to 1.

For phono_mem : rec_voc:

plot_model(model.final, type = "emm", terms = c("phono_mem", "rec_voc"))

list.phono_mem.red <- list(phono_mem = seq(median(df$phono_mem) - 0.5, median(df$phono_mem) + 0.5, by = 1))
emtrends(model.final, pairwise ~ phono_mem, var = "rec_voc", at = list.phono_mem.red, adjust = "mvt", infer = c(T, T))
$emtrends
 phono_mem rec_voc.trend     SE  df asymp.LCL asymp.UCL z.ratio p.value
       3.5        0.0738 0.0185 Inf    0.0375     0.110   3.983  0.0001
       4.5        0.0691 0.0204 Inf    0.0292     0.109   3.393  0.0007

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 

$contrasts
 contrast                    estimate     SE  df asymp.LCL asymp.UCL z.ratio p.value
 phono_mem3.5 - phono_mem4.5  0.00472 0.0205 Inf   -0.0355     0.045   0.230  0.8181

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 

The p-value is once again much larger than 0.05.

Our investigation of the interactions thus shows that none of them is statistically significant. A possible option would be to simplify the model by dropping these interactions. However, this amounts to model selection (for the fixed effects), which a number of prominent statisticians warn against. In what follows, we therefore assess the main effects despite the presence of interactions in the model.

2.3.1.2 Investigating the main effects

Given that none of the interactions we thought could be significant appears to be so, we can focus on the main effects in our model, i.e., the effects of the item-related categorical variables occurrence_l, branching_onset, and V, and the subject-related continuous variables rec_voc, phono_mem, LoE and age.

For occurrence_l:

plot_model(model.final, type = "emm", terms = "occurrence_l")

summary(emmeans(model.final, pairwise ~ occurrence_l, adjust = "mvt", side = "<"), infer = c(TRUE, TRUE), null = 0)$contrasts
 contrast      estimate    SE  df asymp.LCL asymp.UCL z.ratio p.value
 coda - final    -0.406 0.517 Inf      -Inf     0.669  -0.786  0.4699
 coda - other    -1.118 0.389 Inf      -Inf    -0.309  -2.874  0.0059
 final - other   -0.711 0.362 Inf      -Inf     0.042  -1.963  0.0647

Results are averaged over the levels of: branching_onset, V 
Results are given on the log odds ratio (not the response) scale. 
Confidence level used: 0.95 
Conf-level adjustment: mvt method for 3 estimates 
P value adjustment: mvt method for 3 tests 
P values are left-tailed 

We set the parameter side to reflect our hypothesis that a nonword gets easier to repeat when shifting from l in coda position to l in final position to another structure.

We find that:

  • nonwords are significantly more difficult to repeat when l appears in internal coda position than when there is no l (coda - other, p = 0.0059).
  • nonwords are not significantly more difficult to repeat when l appears in final position than when there is no l (final - other), although we are close to significance (p = 0.0647)
  • nonwords are not significantly more difficult to repeat when l appears in internal coda position than when l appears in final position (coda - final)

For branching_onset:

plot_model(model.final, type = "emm", terms = "branching_onset")

summary(emmeans(model.final, pairwise ~ branching_onset, adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)$contrasts
 contrast                            estimate    SE  df asymp.LCL asymp.UCL z.ratio p.value
 branching_onset0 - branching_onset1    0.877 0.176 Inf     0.521       Inf   4.974  <.0001
 branching_onset0 - branching_onset2    1.989 0.481 Inf     1.018       Inf   4.133  <.0001
 branching_onset1 - branching_onset2    1.113 0.488 Inf     0.128       Inf   2.279  0.0266

Results are averaged over the levels of: occurrence_l, V 
Results are given on the log odds ratio (not the response) scale. 
Confidence level used: 0.95 
Conf-level adjustment: mvt method for 3 estimates 
P value adjustment: mvt method for 3 tests 
P values are right-tailed 

We set the parameter side to reflect our hypothesis that the more branching onsets in a nonword, the more difficult it is to repeat.

We observe that all the contrasts are significant, and that therefore:

  • A nonword is more difficult to repeat when it has 1 branching onset than when it has none (p < 0.0001)
  • A nonword is more difficult to repeat when it has 2 branching onsets than when it has none (p < 0.0001)
  • A nonword is more difficult to repeat when it has 2 branching onsets than when it has 1 (p = 0.0266)

For V:

plot_model(model.final, type = "emm", terms = "V")

summary(emmeans(model.final, pairwise ~ V, adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)$contrasts
 contrast estimate    SE  df asymp.LCL asymp.UCL z.ratio p.value
 V1 - V2     0.144 0.203 Inf   -0.2792       Inf   0.707  0.4905
 V1 - V3     0.616 0.208 Inf    0.1825       Inf   2.956  0.0044
 V2 - V3     0.472 0.206 Inf    0.0427       Inf   2.287  0.0303

Results are averaged over the levels of: occurrence_l, branching_onset 
Results are given on the log odds ratio (not the response) scale. 
Confidence level used: 0.95 
Conf-level adjustment: mvt method for 3 estimates 
P value adjustment: mvt method for 3 tests 
P values are right-tailed 

The parameter side corresponds to the hypothesis that the fewer vowels in a nonword, the easier it is to repeat.

We find two significant differences: a nonword with a single vowel is easier to repeat than a nonword with 3 vowels (p = 0.0044), and a nonword with 2 vowels is easier to repeat than a nonword with 3 vowels (p = 0.0303).

For rec_voc:

plot_model(model.final, type = "emm", terms = "rec_voc [all]")

summary(emtrends(model.final, ~ rec_voc, var = "rec_voc", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
 rec_voc rec_voc.trend     SE  df asymp.LCL asymp.UCL z.ratio p.value
    19.7        0.0722 0.0165 Inf     0.045       Inf   4.371  <.0001

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 
P values are right-tailed 

We set the parameter side to reflect our hypothesis that the larger a child’s French receptive vocabulary, the higher the probability of correct repetition.

We observe a significant effect of rec_voc (p < 0.0001): the size of a child’s receptive vocabulary significantly and positively impacts their ability to correctly repeat nonwords.

For phono_mem:

plot_model(model.final, type = "emm", terms = "phono_mem")

summary(emtrends(model.final, ~ phono_mem, var = "phono_mem", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
 phono_mem phono_mem.trend   SE  df asymp.LCL asymp.UCL z.ratio p.value
      3.84           0.127 0.12 Inf    -0.071       Inf   1.054  0.1459

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 
P values are right-tailed 

We set the parameter side to reflect our hypothesis that the larger a child’s phonological memory, the higher the probability of correct repetition.

We do not observe a significant effect of phono_mem.

For LoE:

plot_model(model.final, type = "emm", terms = "LoE [all]")

summary(emtrends(model.final, ~ LoE, var = "LoE", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
 LoE LoE.trend      SE  df asymp.LCL asymp.UCL z.ratio p.value
 180   0.00194 0.00203 Inf   -0.0014       Inf   0.957  0.1694

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 
P values are right-tailed 

We set the parameter side to reflect our hypothesis that the longer the exposure to French, the higher the probability of correct repetition.

We do not observe a significant effect of LoE on the probability of correct repetition.

For age:

plot_model(model.final, type = "emm", terms = "age [all]")

summary(emtrends(model.final, ~ age, var = "age", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
  age age.trend      SE  df asymp.LCL asymp.UCL z.ratio p.value
 89.6    0.0154 0.00964 Inf -0.000422       Inf   1.601  0.0547

Results are averaged over the levels of: occurrence_l, branching_onset, V 
Confidence level used: 0.95 
P values are right-tailed 

We set the parameter side to reflect our hypothesis that the older a child is, the higher the probability of correct repetition.

We do not observe a significant effect of age, although we are very close to significance (p = 0.0547).

2.3.1.3 Summary for the main effects

Not all our hypotheses were confirmed. We found the following results:

  • nonwords are significantly more difficult to repeat when l appears in internal coda position than when there is no l (p = 0.0059).
  • the more branching onsets nonwords have, the more difficult they are to repeat (1 versus 0 BO: p < 0.0001; 2 versus 0 BO: p < 0.0001; 2 versus 1 BO: p = 0.0266)
  • nonwords with 3 vowels are more difficult to repeat than nonwords with 1 or 2 vowels (p = 0.0044 and p = 0.0303, respectively)
  • the larger a child’s French receptive vocabulary, the easier it is for them to repeat the nonwords (p < 0.0001)

Additionally, we observed two statistical tendencies:

  • nonwords where l appears in final position tend to be more difficult to repeat than nonwords without l (p = 0.0647)
  • the older a child, the easier it is for them to repeat the nonwords (p = 0.0547)

We did not observe effects of phono_mem or of LoE. We also did not find a significant difference between l appearing in internal coda position in nonwords and l appearing in final position.

Our analysis overall proves to be quite simple, in the sense that we did not observe any significant interaction.

p1 <- plot_model(model.final, type = "emm", terms = "occurrence_l", show.values = TRUE, value.offset = .3) +
  labs(title = "Effect of occurrence_l", x = "Location of /l/", y = "Predicted probability of correct repetition")
p2 <- plot_model(model.final, type = "emm", terms = "branching_onset", show.values = TRUE, value.offset = .3) +
  labs(title = "Effect of branching_onset", x = "Number of branching onsets", y = "Predicted probability of correct repetition")
p3 <- plot_model(model.final, type = "emm", terms = "V", show.values = TRUE, value.offset = .3) +
  labs(title = "Effect of V", x = "Number of vowels", y = "Predicted probability of correct repetition")
p4 <- plot_model(model.final, type = "emm", terms = "rec_voc [all]", show.values = TRUE, value.offset = .3) +
  labs(title = "Effect of rec_voc", x = "Size of the receptive vocabulary", y = "Predicted probability of correct repetition")
p5 <- plot_model(model.final, type = "emm", terms = "age [all]", show.values = TRUE, value.offset = .3) +
  labs(title = "Effect of age", x = "Age", y = "Predicted probability of correct repetition")

grid.arrange(p1, p2, p3, p4, p5, ncol = 3)

2.3.2 Analyzing the random effects

While our analysis is centered on fixed effects, it is interesting to pay attention to the distribution of values of the different random effects in our model.

p <- plot_model(model.final, type = "re", sort.est = "sort.all", grid = FALSE, transform = NULL)

The most interesting source of information is likely the distribution of the random intercepts of the different children’s L1:

p[[5]] + scale_color_grey()

We can observe that the distribution of values suggests that being bilingual before learning French helps with the repetition task, since all but one case of bilingualism corresponds to a positive intercept, i.e., a facilitatory effect for the task (the single negative intercept, for Russian/Chechen, is also close to 0).
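The intercepts plotted above can also be inspected numerically via lme4::ranef(model.final)$L1; the named vector below uses hypothetical toy values (the L1 labels other than Russian/Chechen are also invented), purely for illustration:

```r
# Toy stand-in for the random intercepts per L1 (hypothetical values).
l1_intercepts <- c(Turkish = 0.35, Arabic = 0.20, Portuguese = 0.10,
                   `Russian/Chechen` = -0.02)
sort(l1_intercepts, decreasing = TRUE)   # ranked from most facilitatory
names(l1_intercepts)[l1_intercepts > 0]  # L1s with a positive intercept
```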

We can then display the sorted distribution of values of the random intercepts for nonwords:

p[[1]] + scale_color_grey()

There is no striking element in the distribution observed.

Finally, we can display, although they are difficult to interpret, the values of the random effects for subject and the three levels of the variable occurrence_l:

grid.arrange(p[[4]] + scale_color_grey(), p[[3]] + scale_color_grey(), p[[2]] + scale_color_grey(), ncol = 2)

We can observe that the variance of the values is larger when l appears in internal coda position or in final position, as shown explicitly in the summary of the model:

summary(model.final)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: repetition ~ occurrence_l + branching_onset + V + scale(LoE) +      scale(age) + scale(rec_voc) + scale(phono_mem) + occurrence_l:scale(LoE) +      branching_onset:scale(LoE) + scale(rec_voc):scale(LoE) +      scale(age):scale(phono_mem) + scale(age):scale(rec_voc) +      scale(rec_voc):scale(phono_mem) + (0 + occurrence_l | subject) +      (1 | L1) + (1 | nonword)
   Data: df
Control: ctrl

     AIC      BIC   logLik deviance df.resid 
  4043.5   4216.1  -1994.8   3989.5     4375 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-5.6570  0.1800  0.3274  0.4927  2.3401 

Random effects:
 Groups  Name              Variance Std.Dev. Corr       
 nonword (Intercept)       0.3337   0.5777              
 subject occurrence_lcoda  2.3767   1.5417              
         occurrence_lfinal 2.7251   1.6508    0.08      
         occurrence_lother 0.2893   0.5379    0.84 -0.09
 L1      (Intercept)       0.1507   0.3882              
Number of obs: 4402, groups:  nonword, 71; subject, 62; L1, 18

Fixed effects:
                                Estimate Std. Error z value    Pr(>|z|)    
(Intercept)                      1.20941    0.44306   2.730     0.00634 ** 
occurrence_lfinal                0.40631    0.51722   0.786     0.43212    
occurrence_lother                1.11752    0.38884   2.874     0.00405 ** 
branching_onset1                -0.87664    0.17626  -4.974 0.000000657 ***
branching_onset2                -1.98915    0.48126  -4.133 0.000035771 ***
V2                              -0.14396    0.20348  -0.707     0.47926    
V3                              -0.61557    0.20827  -2.956     0.00312 ** 
scale(LoE)                       0.12135    0.26340   0.461     0.64500    
scale(age)                       0.15426    0.09635   1.601     0.10936    
scale(rec_voc)                   0.46160    0.10561   4.371 0.000012382 ***
scale(phono_mem)                 0.09446    0.08959   1.054     0.29172    
occurrence_lfinal:scale(LoE)     0.27739    0.35571   0.780     0.43550    
occurrence_lother:scale(LoE)    -0.05084    0.22958  -0.221     0.82475    
branching_onset1:scale(LoE)     -0.14471    0.08839  -1.637     0.10159    
branching_onset2:scale(LoE)      0.01546    0.20311   0.076     0.93934    
scale(LoE):scale(rec_voc)        0.06426    0.09813   0.655     0.51260    
scale(age):scale(phono_mem)      0.05203    0.08966   0.580     0.56171    
scale(age):scale(rec_voc)        0.03987    0.09434   0.423     0.67259    
scale(rec_voc):scale(phono_mem) -0.02249    0.09781  -0.230     0.81811    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3 Further investigations

3.1 Loading additional data

We load an additional table with participant-based rather than item-based data.

df_add_data <- read.table("additional_data.txt", header = TRUE, sep = "\t") %>% as_tibble()

We also load data for monolinguals for a comparison between them and bilinguals:

df_L1 <- read.table("Data_Mo_TD_Mo_SLI_all_items.txt", header = TRUE, sep = "\t") %>%
  as_tibble() %>%
  mutate(repetition = as.character(repetition), # To allow future binding of the corresponding column with the one from df
         group = str_replace_all(group, "MoSLI", "MoDLD")) # DLD was previously referred to as SLI

3.2 Correct repetition of specific structures within nonwords (by subject and for 3 different syllable structures)

Finally, we focus on the production of specific structures within nonwords, namely /l/ in non-final coda position, /l/ in final position, and /l/ in branching onsets, regardless of whether other parts of the nonword are repeated correctly.

We first prepare the data for a boxplot:

df_struct <- df_add_data %>%
  select(-PCC, -PVC, -NWR_pc_Total) %>%
  rename(`Coda /l/` = Coda_l_pc, `Branching onset` = Branching_onset_pc, `Final /l/` = Final_l_pc) %>%
  pivot_longer(cols = 2:4, names_to = "struct", values_to = "pc_correct_rep") %>%
  mutate(struct = as.factor(struct))

Then we draw the boxplot:

df_struct %>%
  ggplot(aes(x = struct, y = pc_correct_rep, fill = struct)) + 
  geom_boxplot(outlier.size = 3, show.legend = FALSE, colour = "black") +
  scale_fill_grey() +
  stat_summary(fun = mean, geom = "point", shape = 4, size = 4, stroke = 2, show.legend = FALSE) +
  geom_jitter(shape = 16, size = 1, position = position_jitter(0.075), show.legend = FALSE) +
  theme_classic() +
  ylab("Percentage of correct repetition") + xlab("") +
  scale_y_continuous(labels = scales::percent, breaks = seq(0.0, 1.0, 0.1))

The average values for the three distributions seem to differ; at least the average for branching onsets appears higher than the other two. We can assess whether these differences are significant.

We first compute the average values explicitly:

df_struct %>% group_by(struct) %>% summarize(mean = mean(pc_correct_rep))
# A tibble: 3 x 2
  struct           mean
  <fct>           <dbl>
1 Branching onset 0.847
2 Coda /l/        0.754
3 Final /l/       0.726

We assess whether the distributions of percentages of correct repetition are normally distributed:

df_struct %>%
  ggplot(aes(sample = pc_correct_rep)) + 
  stat_qq() +
  stat_qq_line() +
  facet_grid(. ~ struct)

The distributions do not look normal, and we should therefore consider a Kruskal-Wallis test:

df_struct %>% kruskal.test(pc_correct_rep ~ struct, data = .)

    Kruskal-Wallis rank sum test

data:  pc_correct_rep by struct
Kruskal-Wallis chi-squared = 1.5272, df = 2, p-value = 0.466

There is no significant difference between the three distributions.
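Had the Kruskal-Wallis test been significant, pairwise post-hoc comparisons would have been the next step; the FSA package loaded above provides Dunn's test for this purpose. As a minimal sketch of the same Holm-corrected logic using only base R's `pairwise.wilcox.test` — on simulated data, since the real test was non-significant (group labels and values below are purely illustrative):

```r
# Hypothetical post-hoc step on toy data; FSA::dunnTest() would be the
# usual choice after a significant Kruskal-Wallis test.
set.seed(123)
toy <- data.frame(struct = rep(c("A", "B", "C"), each = 20),
                  pc     = c(runif(20, 0.7, 1.0),
                             runif(20, 0.4, 1.0),
                             runif(20, 0.4, 1.0)))
# Pairwise Wilcoxon tests with a Holm correction for multiple comparisons
pw <- pairwise.wilcox.test(toy$pc, toy$struct, p.adjust.method = "holm")
pw
```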

We can also observe that the variances of the three distributions look quite different. Is that indeed the case, i.e., are they significantly different from each other?

We first compute the variances explicitly:

df_struct %>% group_by(struct) %>% summarize(var = var(pc_correct_rep))
# A tibble: 3 x 2
  struct             var
  <fct>            <dbl>
1 Branching onset 0.0123
2 Coda /l/        0.0953
3 Final /l/       0.0820

Given the non-normality of the distributions, we consider the Fligner-Killeen non-parametric test of homogeneity of variances:

df_struct %>% fligner.test(pc_correct_rep ~ struct, data = .)

    Fligner-Killeen test of homogeneity of variances

data:  pc_correct_rep by struct
Fligner-Killeen:med chi-squared = 43.242, df = 2, p-value = 0.0000000004074

At least two variances are significantly different from each other. There are no post-hoc tests for the Fligner-Killeen test, but we can apply it to each pair of variances.

df_struct %>% filter(struct != "Branching onset") %>% fligner.test(pc_correct_rep ~ struct, data = .)

    Fligner-Killeen test of homogeneity of variances

data:  pc_correct_rep by struct
Fligner-Killeen:med chi-squared = 3.4613, df = 1, p-value = 0.06282
df_struct %>% filter(struct != "Final /l/") %>% fligner.test(pc_correct_rep ~ struct, data = .)

    Fligner-Killeen test of homogeneity of variances

data:  pc_correct_rep by struct
Fligner-Killeen:med chi-squared = 45.397, df = 1, p-value = 0.00000000001608
df_struct %>% filter(struct != "Coda /l/") %>% fligner.test(pc_correct_rep ~ struct, data = .)

    Fligner-Killeen test of homogeneity of variances

data:  pc_correct_rep by struct
Fligner-Killeen:med chi-squared = 27.055, df = 1, p-value = 0.0000001977

The problem here is that we need to adjust the p-values for multiple comparisons. We apply a Holm-Bonferroni correction:

p_bo <- df_struct %>% filter(struct != "Branching onset") %>% fligner.test(pc_correct_rep ~ struct, data = .) %>% .$p.value
p_final_l <- df_struct %>% filter(struct != "Final /l/") %>% fligner.test(pc_correct_rep ~ struct, data = .) %>% .$p.value
p_coda_l <- df_struct %>% filter(struct != "Coda /l/") %>% fligner.test(pc_correct_rep ~ struct, data = .) %>% .$p.value

p.adjust(c(p_bo, p_final_l, p_coda_l), method = "holm", n = 3)
[1] 0.06281975368431658 0.00000000004825425 0.00000039549925356

We find that the variance of the percentages of correct repetition for branching onsets is significantly lower than the variance for /l/ in final position (p < 0.001) and also significantly lower than the variance for /l/ in coda position (p < 0.001). The variances for coda /l/ and final /l/ do not differ significantly from each other (adjusted p = 0.063).

3.3 By-subject ‘Percent of Vowels Correct’ and ‘Percent of Consonants Correct’

We first prepare the data for the figure:

df_pvc_pcc <- df_add_data %>%
  select(Participant, PCC, PVC) %>%
  pivot_longer(cols = PVC:PCC, names_to = "type", values_to = "score")

We then draw a boxplot to visually assess the situation:

df_pvc_pcc %>%
  mutate(type = as.factor(type),
         score =  score / 100) %>%
  ggplot(aes(x = type, y = score, fill = type)) + 
  geom_boxplot(outlier.size = 3, show.legend = FALSE, colour = "black") +
  scale_fill_grey() +
  stat_summary(fun = mean, geom = "point", shape = 4, size = 4, stroke = 2, show.legend = FALSE) +
  geom_jitter(shape = 16, size = 1, position = position_jitter(0.075), show.legend = FALSE) +
  theme_classic() +
  ylab("Percentage (of Vowels or Consonants) Correct") + xlab("") +
  scale_y_continuous(labels = scales::percent, breaks = seq(0.7, 1.0, 0.05)) +
  scale_x_discrete(labels = c('Consonants', 'Vowels'))

We can observe that the variances of the two distributions seem quite different from each other, but that the mean and median values seem close. We can compute them:

df_pvc_pcc %>% group_by(type) %>% summarize(var = var(score), mean = mean(score), median = median(score))
# A tibble: 2 x 4
  type    var  mean median
  <chr> <dbl> <dbl>  <dbl>
1 PCC    21.7  92.6     94
2 PVC    12.6  94.6     95

We can assess differences with inferential tests. To choose appropriate tests, we assess the normality of the distributions for PVC and PCC with a quantile-quantile plot:

df_pvc_pcc %>%
  mutate(type = as.factor(type),
         score =  score / 100) %>%
  ggplot(aes(sample = score)) + 
  stat_qq() +
  stat_qq_line() +
  facet_grid(. ~ type)

The distributions look rather far from normality. We therefore consider a Mann-Whitney U test and a Fligner-Killeen test of homogeneity of variances:

df_pvc_pcc %>% wilcox.test(score ~ type, data = .)

    Wilcoxon rank sum test with continuity correction

data:  score by type
W = 1462, p-value = 0.02104
alternative hypothesis: true location shift is not equal to 0

The two medians are significantly different from each other (p = 0.0210).
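The Mann-Whitney U test reports significance but no effect size. One common measure is r = |Z| / sqrt(N), which can be recovered from the normal approximation's two-sided p-value. A self-contained sketch on simulated scores (the group labels, means, and spreads below are placeholders, not the study's data):

```r
# Effect size r = |Z| / sqrt(N) for a Mann-Whitney U test, recovered from
# the two-sided p-value of the normal approximation; toy data only.
set.seed(123)
toy <- data.frame(type  = rep(c("PCC", "PVC"), each = 30),
                  score = c(rnorm(30, 92, 5), rnorm(30, 95, 3)))
w <- wilcox.test(score ~ type, data = toy, exact = FALSE)
z <- abs(qnorm(w$p.value / 2))   # |Z| back-computed from the two-sided p
r <- z / sqrt(nrow(toy))         # effect size r, between 0 and 1
r
```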

fligner.test(score ~ type, data = df_pvc_pcc)

    Fligner-Killeen test of homogeneity of variances

data:  score by type
Fligner-Killeen:med chi-squared = 4.1552, df = 1, p-value = 0.04151

The variances also appear to be significantly different from each other (p = 0.0415).

3.4 Comparing bilinguals to monolinguals with typical development or with DLD

3.4.1 Overview of monolingual speakers

We can display an overview of the data for monolinguals:

df_L1 %>%
  dfSummary() %>%
  print(method = 'render', footnote = NA) 

Data Frame Summary

df_L1
Dimensions: 2059 x 4
Duplicates: 0

| No | Variable                | Stats / Values                                                    | Freqs (% of Valid)                         | Valid         | Missing  |
|----|-------------------------|-------------------------------------------------------------------|--------------------------------------------|---------------|----------|
| 1  | subject [character]     | 29 levels: ALL, AMO, ANB, AUC, CEM, CHD, CLK, CMU, ELV, GAV, [19 others] | 71 (3.4%) each; [19 others] 1349 (65.5%)   | 2059 (100.0%) | 0 (0.0%) |
| 2  | group [character]       | 1. MoDLD, 2. MoTD                                                 | 1207 (58.6%), 852 (41.4%)                  | 2059 (100.0%) | 0 (0.0%) |
| 3  | nonword [character]     | 71 levels: faku, fal, fapul, fapus, fikapul, fikupla, fikuspa, filpa, fips, fiska, [61 others] | 29 (1.4%) each; [61 others] 1769 (85.9%)   | 2059 (100.0%) | 0 (0.0%) |
| 4  | repetition [character]  | 1. 0, 2. 1                                                        | 712 (34.6%), 1347 (65.4%)                  | 2059 (100.0%) | 0 (0.0%) |

3.4.2 Combining data for L1 and L2 participants

We create a table with all the data for L1 and L2 participants:

df_all <- df %>%
  dplyr::select(subject, nonword, repetition) %>%
  mutate(group = "Bi") %>%
  mutate(repetition = as.character(repetition)) %>%  # To allow binding the corresponding column with the one from df_L1
  rbind(df_L1) %>%
  mutate(repetition = as.integer(repetition), # For now, we set repetition as an integer
         group = as.factor(group))

We can see that all the participants we consider have been evaluated with 71 nonwords:

df_all %>% group_by(group, subject) %>% summarize(nb_nonwords = n()) %>% print(n=100)
# A tibble: 91 x 3
# Groups:   group [3]
   group subject   nb_nonwords
   <fct> <chr>           <int>
 1 Bi    EANA-A1            71
 2 Bi    EANA-A2            71
 3 Bi    EANA-A3            71
 4 Bi    EANA-A4            71
 5 Bi    EANA-A5            71
 6 Bi    EANA-A6            71
 7 Bi    EANA-A7            71
 8 Bi    EANA-A8            71
 9 Bi    EANA-A9            71
10 Bi    EANA-AL1           71
11 Bi    EANA-AL2           71
12 Bi    EANA-AL3           71
13 Bi    EANA-AL4           71
14 Bi    EANA-AL5           71
15 Bi    EANA-AL6           71
16 Bi    EANA-AR1           71
17 Bi    EANA-AR2           71
18 Bi    EANA-AR3           71
19 Bi    EANA-E1            71
20 Bi    EANA-EA1           71
21 Bi    EANA-GA1           71
22 Bi    EANA-I1            71
23 Bi    EANA-I2            71
24 Bi    EANA-I3            71
25 Bi    EANA-I4            71
26 Bi    EANA-I5            71
27 Bi    EANA-IA1           71
28 Bi    EANA-IA10          71
29 Bi    EANA-IA11          71
30 Bi    EANA-IA12          71
31 Bi    EANA-IA2           71
32 Bi    EANA-IA3           71
33 Bi    EANA-IA4           71
34 Bi    EANA-IA5           71
35 Bi    EANA-IA6           71
36 Bi    EANA-IA7           71
37 Bi    EANA-IA8           71
38 Bi    EANA-IA9           71
39 Bi    EANA-J1            71
40 Bi    EANA-J2            71
41 Bi    EANA-K1            71
42 Bi    EANA-M1            71
43 Bi    EANA-P1            71
44 Bi    EANA-P2            71
45 Bi    EANA-P3            71
46 Bi    EANA-P4            71
47 Bi    EANA-P5            71
48 Bi    EANA-P6            71
49 Bi    EANA-P7            71
50 Bi    EANA-PB1           71
51 Bi    EANA-PB2           71
52 Bi    EANA-PB3           71
53 Bi    EANA-PB4           71
54 Bi    EANA-R1            71
55 Bi    EANA-R2            71
56 Bi    EANA-R3            71
57 Bi    EANA-R4            71
58 Bi    EANA-R5            71
59 Bi    EANA-RAR1          71
60 Bi    EANA-RU            71
61 Bi    EANA-RUT           71
62 Bi    EANA-UR1           71
63 MoDLD ALL                71
64 MoDLD AMO                71
65 MoDLD AUC                71
66 MoDLD CEM                71
67 MoDLD CHD                71
68 MoDLD CLK                71
69 MoDLD CMU                71
70 MoDLD ELV                71
71 MoDLD GAV                71
72 MoDLD LIB                71
73 MoDLD LOP                71
74 MoDLD MAL                71
75 MoDLD MAM                71
76 MoDLD MOB                71
77 MoDLD NBE                71
78 MoDLD PEN                71
79 MoDLD VAM                71
80 MoTD  ANB                71
81 MoTD  KYP                71
82 MoTD  LEB                71
83 MoTD  LIT                71
84 MoTD  MAD                71
85 MoTD  MAS                71
86 MoTD  MLI                71
87 MoTD  MLM                71
88 MoTD  OCS                71
89 MoTD  THP                71
90 MoTD  VAH                71
91 MoTD  WYR                71

We can find how many participants we have in each group:

df_all %>% dplyr::select(group, subject) %>% unique() %>% group_by(group) %>% summarize(nb_participants = n())
# A tibble: 3 x 2
  group nb_participants
  <fct>           <int>
1 Bi                 62
2 MoDLD              17
3 MoTD               12

We have many more bilingual participants, but this does not prevent us from comparing the three groups.

3.4.3 Boxplot for the percentages of correct repetition (by subject) for bilinguals, monolinguals with TD and monolinguals with DLD

We compute the average percentage of correct repetition per subject:

df_group <- df_all %>%
  mutate(repetition = as.integer(repetition)) %>%
  group_by(group, subject) %>%
  summarize(av_pc_correct_repetition = mean(repetition)) %>%
  ungroup()

We then draw a boxplot:

df_group %>%
  ggplot(aes(x = group, y = av_pc_correct_repetition, fill = group)) + 
  geom_boxplot(outlier.size = 3,show.legend = FALSE, colour = "black") +
  scale_fill_grey() +
  stat_summary(fun = mean, geom = "point", shape = 4, size = 4, stroke = 2, show.legend = FALSE) +
  geom_jitter(shape = 16, size = 1, position = position_jitter(0.075), show.legend = FALSE) +
  theme_classic() +
  ylab("Percentage of correct repetition") + xlab("") +
  scale_y_continuous(labels = scales::percent, breaks = seq(0.0, 1.0, 0.05))

3.4.4 Analyzing group as a fixed effect in a mixed-effects logistic regression

We have three groups of participants we want to compare: monolinguals with typical development (MoTD), monolinguals with developmental language disorder (MoDLD), and bilinguals (Bi).

We’re going to use a mixed-effects logistic regression to account for the non-independence of observations, i.e., their grouping by subject and nonword. repetition is the (binary) dependent variable and group the only fixed-effect predictor. We keep things simple by considering random intercepts for subject and nonword, and by not considering other predictors.

3.4.4.1 Computing the model

model.comp <- glmer(repetition ~ group + (1 | subject) + (1 | nonword),
                    data = df_all, control = ctrl, family = binomial(link = "logit"))

The model converges without any problem.

3.4.4.2 Assessing the quality of the model

In order to trust the outputs of our model, we need to make sure that the assumptions underlying a mixed-effects logistic regression are met. While these assumptions are not as clearly codified as for linear regression, the following conditions matter:

- the normality of the distributions of the random effects
- the correct distribution of the residuals of the model

p <- plot_model(model.comp, type = "diag", show.values = TRUE, value.offset = .3)
p$nonword + labs(title = "Random intercept for nonword") + theme_sjplot()

p$subject + labs(title = "Random intercept for subject") + theme_sjplot()

The distributions for the random effects are close to normal; no problem there.

sim.residuals <- simulateResiduals(fittedModel = model.comp, n = 1000, refit = FALSE)
sim.residuals %>% plotQQunif()

plotResiduals(sim.residuals, form = df_all$group, xlab = "Group of subjects", ylab = "Simulated Standardized Residuals")

The simulated residuals look very good. Additionally, the Levene test for the homogeneity of the variances of the three groups is non-significant, meaning the distributions are homoscedastic. We can therefore trust our model.
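The Levene test mentioned above is run internally by DHARMa's plotResiduals when the predictor is categorical. For reference, a Levene-type (Brown-Forsythe) test is simply an ANOVA on absolute deviations from the group medians; a base-R sketch on simulated values (the residuals and group labels below are toy stand-ins, not the model's actual residuals):

```r
# Brown-Forsythe variant of the Levene test: ANOVA on absolute deviations
# from the group medians; toy stand-ins for the simulated residuals.
set.seed(123)
toy <- data.frame(g   = rep(c("Bi", "MoDLD", "MoTD"), each = 25),
                  res = runif(75))
toy$absdev <- with(toy, abs(res - ave(res, g, FUN = median)))
lev <- anova(lm(absdev ~ g, data = toy))  # a non-significant p-value
lev                                       # indicates homoscedasticity
```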

3.4.4.3 Analyzing the group fixed effect

We can now analyze the impact of the group predictor:

Rather than the average percentage of correct repetition per group, we can display the marginal means estimated with the regression model, i.e., accounting for the random effects:

plot_model(model.comp, type = "emm", terms = "group", show.values = TRUE, value.offset = .3) +
  labs(title = "Marginal estimated means for the different groups of subjects", x = "Group", y = "Predicted probability of correct repetition")

The marginal means for the three groups are more precisely:

emmeans(model.comp, "group", type = "response")
 group  prob     SE  df asymp.LCL asymp.UCL
 Bi    0.831 0.0226 Inf     0.782     0.871
 MoDLD 0.445 0.0651 Inf     0.323     0.573
 MoTD  0.941 0.0181 Inf     0.894     0.968

Confidence level used: 0.95 
Intervals are back-transformed from the logit scale 

We can assess whether differences are significant:

summary(emmeans(model.comp, pairwise ~ group, adjust = "mvt"), infer = c(TRUE, TRUE), null = 0)$contrasts
 contrast     estimate    SE  df asymp.LCL asymp.UCL z.ratio p.value
 Bi - MoDLD       1.81 0.279 Inf      1.16      2.47   6.499  <.0001
 Bi - MoTD       -1.18 0.338 Inf     -1.97     -0.39  -3.482  0.0014
 MoDLD - MoTD    -2.99 0.399 Inf     -3.92     -2.06  -7.495  <.0001

Results are given on the log odds ratio (not the response) scale. 
Confidence level used: 0.95 
Conf-level adjustment: mvt method for 3 estimates 
P value adjustment: mvt method for 3 tests 

We find significant differences for all pairs of groups, but bilingual subjects seem closer to monolinguals with typical development than to monolinguals with DLD. We should assess this further.

3.4.5 Non-parametric bootstrap test of the distances between bilingual subjects, monolinguals with TD and monolinguals with DLD

We can assess with a resampling method, more precisely a bootstrap test, whether the bilinguals’ average performance is significantly closer to the Mo-TD’s than to the Mo-DLD’s. Our alternative hypothesis is that bilingual subjects are closer on average (in terms of performance during the NWR task) to monolinguals with TD than to monolinguals with DLD. The null hypothesis which we try to reject is that bilingual subjects are equidistant between Mo-TD and Mo-DLD, or closer to Mo-DLD.

We define a function which, given the distribution of by-subject repetition scores in our three groups, derives a resampled distribution and checks whether H0 is verified with this new distribution:

get_result_test <- function(i, df_subj) {
  results <- df_subj %>% group_by(group) %>% slice_sample(prop = 1.0, replace = TRUE) %>% summarize(av_perf = mean(av_perf)) %>% ungroup()
  v <- results %>% pull(av_perf)
  names(v) <- results %>% pull(group)

  test <- (v["Bi"] - v["MoDLD"]) <= (v["MoTD"] - v["Bi"])
  
  return (test)
}

We call the function above 10,000 times to see how many times resampling leads to H0 being verified:

df_subj <- df_all %>% group_by(group, subject) %>% summarize(av_perf = mean(repetition)) %>% ungroup()
tests <- sapply(1:10000, get_result_test, df_subj = df_subj, simplify = T) %>% as.numeric()
sum(tests)
[1] 40

The number of times H0 is verified is very low and amounts to the following percentage of the 10,000 attempts:

sum(tests) / length(tests)
[1] 0.004

This percentage corresponds to the rate of false positives we would incur if we rejected H0, and it is smaller than 0.05 (p = 0.004). We can therefore conclude that the bilinguals’ average performance is significantly closer to the Mo-TD’s than to the Mo-DLD’s.

3.5 By-subject z-scored receptive vocabulary size

We simply draw a boxplot to visually assess the situation:

df %>%
  group_by(subject) %>%
  summarize(rec_voc_zscore = mean(rec_voc_zscore)) %>%
  ungroup() %>%
  mutate(group = as.factor("participants")) %>%
  ggplot(aes(x = group, y = rec_voc_zscore)) + 
  geom_boxplot(outlier.size = 3, fill="#A4A4A4", coef = 1.5) +
  stat_summary(fun = mean, geom = "point", shape = 4, size = 4, stroke = 2) +
  geom_jitter(shape = 16, size = 1, position = position_jitter(0.075)) +
  theme_classic() +
  ylab("Receptive vocabulary size (z score)") + xlab("") +
  scale_y_continuous(breaks = seq(-30.0, 0.0, 5.0))